Abstract

e-Valuate is a game on arithmetic expressions. The players have con- 
trasting roles of maximizing and minimizing the given expression. The 
maximizer proposes values and the minimizer substitutes them for variables
of his choice. When the expression is fully instantiated, its value is
compared with a certain minimax value that would result if the players 
played to their optimal strategies. The winner is declared based on this 
comparison. 

We use a game tree to represent the state of the game and show how 
the minimax value can be computed efficiently using backward induction 
and alpha-beta pruning. The efficacy of alpha-beta pruning depends on 
the order in which the nodes are evaluated. Further improvements can be 
obtained by using transposition tables to prevent reevaluation of the same 
nodes. We propose a heuristic for node ordering. We show how the use 
of the heuristic and transposition tables leads to improved performance by
comparing the number of nodes pruned by each method. 

Keywords: Arithmetic expressions, game trees, alpha-beta pruning 

1 Introduction 

Given an arithmetic expression E involving variables and the standard operators 
(+, -, * and /), players Amogha and Dhruva evaluate E with contrasting goals;
Amogha would like to maximize E while Dhruva would like to minimize E. 
Towards this end, they take turns to instantiate the variables. Amogha starts 
and, at each move, proposes a value (digit 0-9) and Dhruva substitutes the 
value for a variable of his choice. When the expression is fully instantiated, it is 
evaluated and compared with a certain minimax value that would result if the 
players played to their optimal strategies. Let val(E) be the value of E at the 
end of the game and minimax(E) be the minimax value. The winner is then
determined in the following way. 

o If val(E) > minimax(E), then Amogha is declared the winner.

o If val(E) < minimax(E) then Dhruva is declared the winner. 

o If val(E) = minimax(E), then the game is a draw. 

For example, if E = X * (Y - Z), a possible sequence of moves is

1. Amogha chooses 5 and Dhruva replaces X with 5 so that E = 5 * (Y - Z).

2. Next Amogha chooses 3, which Dhruva substitutes for Z, leading to E =
5 * (Y - 3).

3. Finally Amogha chooses 9, which Dhruva substitutes for the remaining
variable Y, and the final value for the expression is 5 * (9 - 3) = 30.

With more strategic play from either player, the expression is evaluated
differently. For instance, with the same moves from Amogha and optimal play
from Dhruva, the substitutions would be 5 -> Y, 3 -> X and 9 -> Z and the
expression evaluates to -12. With optimal play from both players, a possible
sequence of moves is 6 -> Y, 3 -> X and 0 -> Z with E evaluating to the
minimax value 18.
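
The minimax value quoted in this example can be checked by exhaustive search. The following is a minimal sketch, not the authors' implementation: the function name `minimax_value`, the representation of E as a Python callable over a dictionary of assignments, and the tuple `free` of uninstantiated variables are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

Expr = Callable[[Dict[str, int]], float]  # hypothetical representation of E


def minimax_value(expr: Expr, free: Tuple[str, ...],
                  env: Optional[Dict[str, int]] = None) -> float:
    """Exhaustive minimax: MAX proposes a digit, MIN picks the variable."""
    env = env or {}
    if not free:                       # fully instantiated: evaluate E
        return expr(env)
    best = float("-inf")
    for digit in range(10):            # MAX tries every digit 0-9
        worst = float("inf")
        for var in free:               # MIN tries every remaining variable
            rest = tuple(v for v in free if v != var)
            worst = min(worst, minimax_value(expr, rest, {**env, var: digit}))
        best = max(best, worst)
    return best


E = lambda a: a["X"] * (a["Y"] - a["Z"])
print(minimax_value(E, ("X", "Y", "Z")))   # prints 18
```

The search visits every node of the game tree, so it is only practical for a handful of variables.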

We will refer to this version of the game as e-Valuate. Specific instances of
the game have appeared in books on mathematical puzzles. For example, in [9], 
the expression is a difference of two four digit numbers and the reader is asked 
to find the minimax value. 

Some possible variations on this form of the game are the following. 

o The expression as well as the domain can be generalized. For example, 
other mathematical operators can be introduced in the expression and 
the domain can include other values over which the expression can be 
evaluated. 

o An alternate way of playing the game is for the players to switch roles 
at the end of the game and reevaluate the expression. If the expression 
evaluates to a larger value in one of the games then the maximizer in that 
game is the winner. This version could be applicable when the number of 
variables is large enough that computing the minimax value is infeasible. 

Another variant is for the first player to take on the role of the minimizer and
the second player that of the maximizer. This is, however, equivalent to the
original version since min(E) = -max(-E) and max(E) = -min(-E), where
the minimum and maximum are carried out over the domain of the variables.
Thus, the final value under optimal play from both players is -minimax(-E).

Minimax is a more general term and applies to any two-player zero-sum game
[5]. By using a game tree to represent the states of the game and the moves of
the players, the minimax algorithm can be used to determine the best move at 
each position in the game in the following manner. First values are assigned to 
the leaf nodes using an evaluation function. Next, the players MAX and MIN 
attempt to maximize and minimize the value of the nodes corresponding to their 
turn of play. For an intermediate node that corresponds to MAX's turn to play, 
the value of the node is the maximum of the values of its children. Similarly, for 
an intermediate node that corresponds to MIN's turn to play, the value of the 
node is the minimum of the values of its children. The value at the root is the 
minimax value of the game. For example, if a game is designed such that under 
optimal play, MIN has a winning strategy, and the leaf nodes are assigned a 
value of +1 or -1 according to whether the corresponding position is a win for
MAX or MIN, then the minimax value will be -1.
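
As an illustration of this backing-up rule, here is a small sketch on an explicit tree. The nested-list encoding (an integer is a leaf value, a list is an internal node) and the name `minimax_tree` are assumptions chosen for brevity, not notation from the paper.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]  # hypothetical encoding: int = leaf, list = children


def minimax_tree(node: Tree, maximizing: bool = True) -> int:
    """Back up leaf values: maximum at MAX nodes, minimum at MIN nodes."""
    if isinstance(node, int):
        return node
    values = [minimax_tree(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Leaves are +1 (win for MAX) or -1 (win for MIN); the root is a MAX node.
min_wins = [[1, -1], [-1, -1]]   # every MIN node can reach a -1 leaf
print(minimax_tree(min_wins))    # prints -1
```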

Several optimizations to this method of computing the minimax value have 
been studied [H [5] . Some well known techniques are 

o Alpha-beta pruning: This is a windowing procedure that starts with an 
interval of (-∞, +∞) for the minimax value. As nodes are evaluated,
the window shrinks and any node that evaluates to a value outside this 
window is pruned along with the subtree rooted at that node. 

o Negascout: Negascout [7] works by assuming that for each node, the first 
child will be in the principal variation (the sequence of moves leading 
to the minimax value). It uses a null search window for the remaining 
children and on failure, uses a full search window. Thus this method is 
most effective when there is a good ordering for evaluating the nodes. 

o Transposition tables: This is a memoization technique where the values 
of nodes that are evaluated are stored and retrieved when another node 
that corresponds to the same game position has to be evaluated. This 
effectively prunes the subtree rooted at that node. 

The computational challenge in e-Valuate is an efficient way of determin- 
ing the minimax value in order to identify the winner. We show how these 
techniques lead to more efficient ways of determining minimax(E).

In the next section, we introduce the game tree for e-Valuate and show how
the minimax value can be computed using backward induction. In Section 3, we
show how improved performance can be obtained by combining the minimax
algorithm with alpha-beta pruning. We describe these methods in the context
of our game. The efficacy of alpha-beta pruning depends on the order
in which the children of each node are evaluated. We describe a heuristic
for determining this order. Further improvements can be obtained by avoiding
repeated reevaluation of the same game position through the use of
transposition tables. In Section 4, we provide implementation details and
compare the number of nodes pruned by the two methods, alpha-beta and
alpha-beta with node ordering and transposition tables, for different
arithmetic expressions. We conclude with some unanswered questions related to
this game.

We fix some notation. For an arithmetic expression E, let n be the number
of variables in E. E(i -> X) denotes the expression E with variable X replaced
by i. The players MAX and MIN will denote the maximizer and minimizer
respectively.

For general aspects of game theory, see [5]; [3] is a useful online resource
for lectures, glossary of terms and articles related to game theory. The game
algorithms we have outlined above are well documented in books on artificial
intelligence (e.g. [1], [8]).

2 The Game Tree for e-Valuate 

In the framework of game theory, e-Valuate can be classified as a finite,
sequential, two-person game with perfect information. It is finite as the game
ends after a finite number of moves, sequential since the players take turns in
making their moves (rather than moving simultaneously as in
rock-paper-scissors), and a game of perfect information as each player is
aware of the other's moves at any point in the game.

Sequential games with perfect information can be represented using a game 
tree. The root of the tree corresponds to the initial configuration of the game 
(in our case, the expression E) and the edges represent possible moves that 
the players make. Each node in the tree represents a position in the game. 
The root and the leaf nodes are MAX nodes and the nodes at intermediate
levels are alternately MIN and MAX nodes, representing positions where the
minimizer or maximizer has to make a move. Each non-terminal MAX node has 10
children corresponding to 10 possible moves (choosing any digit). A MIN node
at height d has (d + 1)/2 children that correspond to the (d + 1)/2
uninstantiated variables. The height of the tree is 2n. We will denote by
tree(E) the game tree corresponding to E.

The number of nodes in the tree, T(n), depends only on n and satisfies the 
recursion 

T(n) = 11 + 10nT(n - 1) (1) 

which follows from observing that the root node has 10 children, each of which
has n children that correspond to game trees on expressions with (n - 1)
variables.

We can use this to bound T(n) by

$$2\, n!\, 10^n < T(n) < 2\, e^{1/10}\, n!\, 10^n$$

from the following argument. Let N = n! 10^n be the number of leaves of tree(E).
Starting from the bottom and counting the number of nodes at each level, we
get

$$T(n) = N + \frac{N}{1} + \frac{N}{1 \cdot 10} + \frac{N}{1 \cdot 10 \cdot 2} + \cdots + \frac{N}{n!\, 10^n}$$

$$= \sum_{i=0}^{n} \frac{N}{i!\, 10^i} + \sum_{i=1}^{n} \frac{N}{i!\, 10^{i-1}} < 2 \sum_{i=0}^{n} \frac{N}{i!\, 10^i} < 2\, e^{1/10} N$$

as desired.
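
The recursion (1) and the bounds can be checked numerically for small n. This is only an illustrative sketch; the function name `tree_size` and the base case T(0) = 1 (a constant expression is a single node) are assumptions that the recursion implies but the text does not state.

```python
import math


def tree_size(n: int) -> int:
    """T(n) from recursion (1), assuming T(0) = 1 for a constant expression."""
    return 1 if n == 0 else 11 + 10 * n * tree_size(n - 1)


for n in range(1, 7):
    leaves = math.factorial(n) * 10 ** n              # N = n! 10^n
    lower, upper = 2 * leaves, 2 * math.exp(0.1) * leaves
    print(n, tree_size(n), lower < tree_size(n) < upper)
```

For n = 3 this gives T(3) = 12941, which lies between 12000 and roughly 13262.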

We identify each node in a tree by 

o a sequence of instantiations of the variables and possibly an additional
digit (for a MIN node). For example, if E = (10 - X) * Y, then a MAX
node in tree(E) is {1 -> Y} and a MIN node is {1 -> Y, 3}. Thus MAX
nodes correspond to partially instantiated expressions and MIN nodes to
(expression, digit) pairs.

o a value which is the minimax value of the partially instantiated expression 
for a MAX node and the minimum of the minimax values of the children 
for a MIN node. This is the value E would evaluate to under optimal play 
starting from the position given by the node. This is also referred to as 
the score of the position given by the node [1].

The game tree tree((10 - X) * Y) is shown partially in Figure 1. The edges are
labelled by the moves corresponding to the players.

The minimax value is computed by the method of backward induction applied
to tree(E). This procedure works by reasoning backwards from the end of
the game and computing the optimal move for the players at each position. At
a terminal (MAX) node, the expression is a constant, and the value of the node
is this constant. Working up, each MIN node has as its value the minimum of
the values of its valid children and each MAX node the maximum of the values
of its valid children (see footnote 1). The value of the root is minimax(E).
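
Backward induction with the validity rule from footnote 1 can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: `backward_induction` is a hypothetical name, E is modelled as a Python callable, exact arithmetic uses `fractions.Fraction`, and an invalid node is represented by `None`.

```python
from fractions import Fraction
from typing import Callable, Dict, Optional, Tuple


def backward_induction(expr: Callable[[Dict[str, int]], Fraction],
                       free: Tuple[str, ...],
                       env: Optional[Dict[str, int]] = None) -> Optional[Fraction]:
    """Value of a MAX node, or None if the node is invalid."""
    env = env or {}
    if not free:
        try:
            return expr(env)
        except ZeroDivisionError:
            return None                      # terminal node without a finite value
    max_candidates = []
    for digit in range(10):
        min_candidates = []
        for var in free:
            rest = tuple(v for v in free if v != var)
            value = backward_induction(expr, rest, {**env, var: digit})
            if value is not None:            # keep only valid children
                min_candidates.append(value)
        if min_candidates:                   # the MIN node is valid
            max_candidates.append(min(min_candidates))
    return max(max_candidates) if max_candidates else None


E = lambda a: Fraction(a["X"]) / a["Y"]      # division by zero when Y = 0
print(backward_induction(E, ("X", "Y")))     # prints 3
```

For E = X / Y the best first proposal is 3: placing it in X lets MAX finish with Y = 1, and placing it in Y lets MAX finish with X = 9, both giving 3.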



Footnote 1: An internal node is deemed valid if it has a valid child. A
terminal node is valid if it evaluates to a finite value.

Figure 1: A partial game tree for E = (10 - X) * Y

3 Alpha-beta Pruning and Node Ordering

To determine minimax(E), it's not necessary to evaluate every node in the tree.
Suppose alpha is the current maximum (over the children evaluated so far) for
a MAX node and beta the current minimum for a MIN node. For a MAX node,
if its alpha value is at least the beta value of its parent, then there is no
reason to explore the node further as the final value of its parent will be at
most alpha. Each pruning of a subtree of a MAX node in this manner is referred
to as a beta cutoff. Similarly, for a MIN node, if its beta value is at most the
alpha value of its parent, then the remaining subtrees of this node can be
pruned as the final value of its parent will be at least beta. These prunings
are alpha cutoffs.

A subtree at height d that is pruned by an alpha cutoff is rooted at a MAX
node and prunes T(d/2) nodes. A subtree at height d that is pruned by a beta
cutoff is rooted at a MIN node and prunes (T(⌈d/2⌉) - 1)/10 nodes.
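
These two subtree sizes can be cross-checked against a direct level-by-level count. The sketch below is illustrative only; the helper names (`tree_size`, `max_subtree_size`, `min_subtree_size`, `count_nodes`) and the base case T(0) = 1 are assumptions.

```python
def tree_size(n: int) -> int:
    """T(n) from recursion (1), assuming T(0) = 1."""
    return 1 if n == 0 else 11 + 10 * n * tree_size(n - 1)


def max_subtree_size(height: int) -> int:
    """Nodes in a subtree rooted at a MAX node (even height)."""
    return tree_size(height // 2)


def min_subtree_size(height: int) -> int:
    """Nodes in a subtree rooted at a MIN node (odd height)."""
    return (tree_size((height + 1) // 2) - 1) // 10


def count_nodes(height: int) -> int:
    """Count the same subtree directly from the branching factors."""
    if height == 0:
        return 1
    if height % 2 == 0:                    # MAX node: one child per digit
        return 1 + 10 * count_nodes(height - 1)
    remaining = (height + 1) // 2          # MIN node: one child per free variable
    return 1 + remaining * count_nodes(height - 1)


print(max_subtree_size(2), min_subtree_size(3))   # prints 21 43
```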

For example, suppose E = (10 - X) * Y. Then minimax(E) = 45 and a terminal
node that achieves this value is {5 -> X, 9 -> Y}. To compute minimax(E),
we start at the root node and evaluate the MIN nodes {0}, {1}, ..., {5} in
succession, which return the values 0, 10, 20, 30, 40 and 45 respectively. At
this point, the alpha value at the root is max(0, 10, 20, 30, 40, 45) = 45.
When node {6} is explored, the MIN node computes the value of the MAX node
{6 -> X}, which returns 36 as its minimax value. Thus the beta value of {6} is
36, which is smaller than 45, the alpha value of its parent. As a result, the
node {6 -> Y} is not evaluated. Similarly, the nodes {7 -> Y}, {8 -> Y} and
{9 -> Y} are not evaluated, leading to 4 alpha cutoffs.
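
The walkthrough can be reproduced with a short sketch of alpha-beta pruning that, like the description in this section, compares each node only with the current value of its parent. The function names, the `stats` dictionary, and the choice to record only cutoffs that actually skip at least one child are assumptions made for this illustration.

```python
import math


def max_node(expr, free, env, beta, stats):
    """MAX node: try digits 0-9, stop once the value reaches the parent's beta."""
    if not free:                                   # terminal node
        stats["leaves"] += 1
        return expr(env)
    best = -math.inf                               # current alpha
    for digit in range(10):
        best = max(best, min_node(expr, free, env, digit, best, stats))
        if best >= beta:                           # beta cutoff
            break
    return best


def min_node(expr, free, env, digit, alpha, stats):
    """MIN node: try each free variable, stop once the value drops to alpha."""
    worst = math.inf                               # current beta
    for k, var in enumerate(free):
        rest = free[:k] + free[k + 1:]
        worst = min(worst, max_node(expr, rest, {**env, var: digit}, worst, stats))
        if worst <= alpha:                         # alpha cutoff
            if k + 1 < len(free):                  # record only if children were skipped
                stats["alpha_cutoffs"].append((dict(env), digit))
            break
    return worst


E = lambda a: (10 - a["X"]) * a["Y"]
stats = {"leaves": 0, "alpha_cutoffs": []}
value = max_node(E, ("X", "Y"), {}, math.inf, stats)
print(value, [digit for _, digit in stats["alpha_cutoffs"]])   # prints 45 [6, 7, 8, 9]
```

With digits tried in the order 0 to 9 and variables in the order (X, Y), the recorded cutoffs occur exactly at the MIN nodes {6}, {7}, {8} and {9}.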

The pseudocode for computing the minimax value of E with alpha-beta pruning
is given by Algorithm 1. The function alphabeta() takes as its parameters
the current node, its height, the current value (alpha or beta) of the parent
node, the digit passed (valid for a MIN node) and the current player. Apart
from the minimax value, the algorithm also returns the number of nodes pruned
by alpha and beta cutoffs, which are computed using the recursion formula (1),
as well as the entire principal variation. The function is called with the
command

    alpha_prunes = beta_prunes = 0; principal_var = ''
    alphabeta(root, 2n, +∞, -1, MAX)

3.1 A Heuristic for Node Ordering 

The effectiveness of alpha-beta pruning depends on the order in which each
node's children are explored. For example, for the expression E = (10 - X) * Y,
suppose we evaluate a MAX node by choosing the digits in the sequence
(5, 4, 6, 3, 7, 2, 8, 1, 9, 0), and evaluate a MIN node by setting the
variable sequence as (Y, X) if the digit passed to it is less than 5 and as
(X, Y) if the digit passed to it is at least 5. Then, to calculate minimax(E),
the node {5} is evaluated first and returns 45. Subsequently, for each of the
MIN nodes, the order in which its children are explored ensures that there is
an alpha cutoff.

Let v̂(x) be an estimate for the value v(x) of node x. We propose a heuristic
for determining the order in which the digits are to be chosen at a MAX node.
The ordering is static in the sense that it is determined by E and is the same
for all nodes being evaluated. We estimate the values of the MAX nodes 2 levels
below the root node. These estimates are backed up, by taking the minima, to
estimate the values of their parents. If these estimates are placed in
decreasing order, as v̂({i_0}) ≥ v̂({i_1}) ≥ ... ≥ v̂({i_9}), then the children
of a MAX node are evaluated in the sequence i_0, i_1, ..., i_9.

For a MAX node x = {i -> X}, our estimate v̂(x) is simply the maximum
of E over some random instantiations of the variables of E while fixing X at i.
More precisely, to estimate v({i -> X}), we fix X at i and randomly instantiate
the other variables in E with digits and compute val(E). We do this a fixed
number of times and take the maximum of the resulting values.
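
A rough sketch of this sampling heuristic is given here. The function name `digit_order`, the default of 200 samples, the fixed random seed and the decision to skip samples that divide by zero are all assumptions; the paper does not specify them.

```python
import random
from typing import Callable, Dict, List, Sequence


def digit_order(expr: Callable[[Dict[str, int]], float],
                variables: Sequence[str],
                samples: int = 200, seed: int = 0) -> List[int]:
    """Order digits by the backed-up estimates of the root's MIN children."""
    rng = random.Random(seed)

    def estimate(var: str, digit: int) -> float:
        # Estimate v({digit -> var}): best value seen over random completions.
        best = float("-inf")
        for _ in range(samples):
            env = {v: rng.randrange(10) for v in variables}
            env[var] = digit
            try:
                best = max(best, expr(env))
            except ZeroDivisionError:
                pass                               # ignore invalid instantiations
        return best

    # Back up to the MIN node {digit} by taking the minimum over variables.
    backed_up = {d: min(estimate(v, d) for v in variables) for d in range(10)}
    return sorted(range(10), key=lambda d: backed_up[d], reverse=True)


E = lambda a: (10 - a["X"]) * a["Y"]
print(digit_order(E, ("X", "Y")))
```

For E = (10 - X) * Y the estimates are very likely to equal min(9(10 - i), 10i) for each digit i, which reproduces the digit sequence 5, 4, 6, 3, 7, 2, 8, 1, 9, 0 used in the example at the start of this subsection.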

Algorithm 1 Minimax value of E with alpha-beta pruning

    function alphabeta(node, height, parent_αβ, digit, player)
        if height = 0                                  ▷ terminal node
            principal_var = ""
            return the value of node
        if player = MAX                                ▷ process MAX node
            maxstr = ""                                ▷ principal variation from this node
            maxval = -∞                                ▷ current α
            for each i from 0 to 9                     ▷ evaluate each child in this loop
                value = alphabeta(node, height - 1, maxval, i, MIN)
                if value > maxval and value ≠ +∞       ▷ update α
                    maxval = value
                    maxstr = principal_var + 'i'
                if maxval ≥ parent_αβ                  ▷ beta prune
                    beta_prunes = beta_prunes + (9 - i) * (T(height/2) - 1)/10
                    break
            principal_var = maxstr
            return maxval
        else                                           ▷ process MIN node
            minstr = ""
            minval = +∞                                ▷ current β
            j = (height + 1)/2                         ▷ number of children left to explore
            for each uninstantiated variable v in node
                j = j - 1
                grandchild = node(digit -> v)          ▷ replace v by digit in node
                value = alphabeta(grandchild, height - 1, minval, -1, MAX)
                if value < minval and value ≠ -∞       ▷ update β
                    minval = value
                    minstr = principal_var + 'v'
                if minval ≤ parent_αβ                  ▷ alpha prune
                    alpha_prunes = alpha_prunes + j * T((height - 1)/2)
                    break
            principal_var = minstr
            return minval
    end function

| E | minimax(E) | Principal variation | Alpha-beta: no. of nodes pruned | Digit order (node ordering) | Alpha-beta with node ordering and transposition tables: no. of nodes pruned |
|---|---|---|---|---|---|
| x/y + 2*y/z - z/x | 16/3 | 3 -> x, 3 -> z, 9 -> y | 4515 | 1,4,3,5,6,7,2,8,9,0 | 7526 |
| w - 3*y*z + 3*x | 21 | 6 -> w, 7 -> y, 5 -> x, 0 -> z | 302271 | 9,8,7,6,5,4,3,2,1,0 | 464162 |
| v + w + x - y - z | 12 | 7 -> y, 8 -> v, 7 -> w, 4 -> x, 0 -> z | ≈ 2.18 * 10^8 | 7,3,4,5,6,1,2,8,9,0 | ≈ 2.53 * 10^8 |
| (a + b)/c + (d + e)/f | 7.6 | 3 -> a, 3 -> c, 5 -> f, 9 -> b, 9 -> d, 9 -> e | ≈ 1.33 * 10^10 | 5,4,3,2,8,1,9,0,6,7 | ≈ 1.55 * 10^10 |

Table 1: Comparison of Alpha-beta and Alpha-beta with Node Ordering

The performance of the minimax algorithm is further enhanced by noting that
several nodes in tree(E) correspond to the same game position and thus have
to be evaluated only once. An example is the pair of nodes {2 -> X, 1 -> Y} and
{1 -> Y, 2 -> X} in tree(X * (Y - Z)). We exploit this fact by storing, for each
MAX node x that is fully evaluated (i.e. none of its children are pruned by
beta cutoffs), its value and the principal variation starting at x. On
subsequent visits to nodes that correspond to the same game position, this
value is retrieved instead of being recomputed.
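
The effect of a transposition table can be sketched on plain minimax, without pruning, so that every stored value is exact. The function name `minimax_with_table` and the use of a frozenset of (variable, digit) pairs as the table key are assumptions made for illustration; the paper's implementation additionally stores the principal variation.

```python
from typing import Callable, Dict, FrozenSet, Sequence, Tuple


def minimax_with_table(expr: Callable[[Dict[str, int]], float],
                       variables: Sequence[str]) -> Tuple[float, int]:
    """Return (minimax value, number of distinct MAX positions evaluated)."""
    table: Dict[FrozenSet[Tuple[str, int]], float] = {}

    def max_node(env: Dict[str, int]) -> float:
        key = frozenset(env.items())        # order of substitutions is irrelevant
        if key in table:
            return table[key]               # transposition: reuse the stored value
        free = [v for v in variables if v not in env]
        if not free:
            value = expr(env)
        else:
            value = max(min(max_node({**env, v: d}) for v in free)
                        for d in range(10))
        table[key] = value
        return value

    return max_node({}), len(table)


E = lambda a: a["X"] * (a["Y"] - a["Z"])
print(minimax_with_table(E, ("X", "Y", "Z")))   # prints (18, 1331)
```

Each variable is either unassigned or holds one of ten digits, so at most 11^n distinct MAX positions exist, far fewer than the number of MAX nodes in the full tree.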

4 Implementation Details 

We first convert E to a postfix form using Dijkstra's shunting yard algorithm 
2 . During evaluation, the variables are substituted with values, and val(E) is 
computed using the reverse polish notation evaluation 1 algorithm. 
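
A compact sketch of the two steps, conversion to postfix and postfix evaluation, is shown here. It is not the authors' code: the function names, the regular-expression tokenizer, the restriction to single-letter variables and binary operators (no unary minus), and the use of `None` for a division by zero are assumptions.

```python
import re
from fractions import Fraction
from typing import Dict, List, Optional

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_postfix(text: str) -> List[str]:
    """Shunting-yard conversion for left-associative binary operators."""
    output: List[str] = []
    stack: List[str] = []
    for tok in re.findall(r"\d+|[A-Za-z]|[-+*/()]", text):
        if tok in PRECEDENCE:
            while (stack and stack[-1] in PRECEDENCE
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()                      # discard the "("
        else:
            output.append(tok)               # number or variable
    while stack:
        output.append(stack.pop())
    return output


def eval_postfix(postfix: List[str], env: Dict[str, int]) -> Optional[Fraction]:
    """Evaluate postfix tokens; None signals a division by zero."""
    stack: List[Fraction] = []
    for tok in postfix:
        if tok in PRECEDENCE:
            right, left = stack.pop(), stack.pop()
            if tok == "+":
                stack.append(left + right)
            elif tok == "-":
                stack.append(left - right)
            elif tok == "*":
                stack.append(left * right)
            elif right == 0:
                return None
            else:
                stack.append(left / right)
        elif tok.isdigit():
            stack.append(Fraction(int(tok)))
        else:
            stack.append(Fraction(env[tok]))
    return stack[0]


postfix = to_postfix("X*(Y-Z)")
print(postfix, eval_postfix(postfix, {"X": 5, "Y": 9, "Z": 3}))
```

Converting once and re-evaluating the postfix form for each full instantiation avoids re-parsing E at every leaf.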

Table 1 compares the number of nodes pruned by alpha-beta and alpha-beta
with node ordering and also shows the ordering of digits at each MAX node as
determined by the heuristic. For the alpha-beta method, the number of nodes
pruned is the sum of the number of nodes pruned by alpha and beta cutoffs.
For alpha-beta with node ordering, the number of nodes pruned is the sum of
the number of nodes pruned by alpha and beta cutoffs and the transposition
tables. For expressions with five or six variables, we have observed a ten-fold
speedup in the performance of the second method over the first.

We also attempted ordering the MIN nodes as well as using different
orderings for MAX nodes at different heights using the same heuristic, but any
gains in the number of nodes pruned were offset by the computational time in
determining the order. Other promising approaches such as Negascout [7] and
the MTD-f [5] algorithm have not been attempted yet.

5 Conclusion



We have demonstrated the effectiveness of search algorithms for computing the 
minimax value of e-Valuate. Other heuristics for node values could yield more 
effective ordering of the nodes and thus faster algorithms. 

One would also like to understand what expressions and associated domains 
constitute a fair game. A fair game is one where if MAX and MIN make their 
moves randomly, they have equal chances of winning. For example, if E has only 
+ and * operators or is defined on one variable, then minimax(E) = max(E)
and MIN can never lose.
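
The claim for expressions built only from + and * can be checked by brute force on a small case. The sketch below is illustrative; `minimax_value`, `max_value` and the sample expressions are assumptions, not taken from the paper.

```python
from itertools import product
from typing import Callable, Dict, Optional, Tuple


def minimax_value(expr: Callable[[Dict[str, int]], float],
                  free: Tuple[str, ...],
                  env: Optional[Dict[str, int]] = None) -> float:
    """Exhaustive minimax: MAX proposes a digit, MIN picks the variable."""
    env = env or {}
    if not free:
        return expr(env)
    return max(
        min(minimax_value(expr, tuple(v for v in free if v != var),
                          {**env, var: digit}) for var in free)
        for digit in range(10)
    )


def max_value(expr: Callable[[Dict[str, int]], float],
              variables: Tuple[str, ...]) -> float:
    """Unconstrained maximum of E over all digit assignments."""
    return max(expr(dict(zip(variables, digits)))
               for digits in product(range(10), repeat=len(variables)))


only_plus_times = lambda a: a["X"] * a["Y"] + a["Z"]
with_minus = lambda a: a["X"] * (a["Y"] - a["Z"])
names = ("X", "Y", "Z")
print(minimax_value(only_plus_times, names), max_value(only_plus_times, names))  # prints 90 90
print(minimax_value(with_minus, names), max_value(with_minus, names))            # prints 18 81
```

When E uses only + and *, MAX can propose 9 on every move, so MIN's choice of variable does not matter; with a subtraction present, the minimax value falls well below the maximum.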